On the combinatorics of crystal structures. II. Number of Wyckoff sequences of a given subdivision complexity

The number of Wyckoff sequences of a given subdivision complexity is calculated by means of a generating polynomial approach and a dynamic programming approach. The result depends on the choice of space-group symmetry (which is obligatory) and Wyckoff sequence length (which is optional). It also takes into account specified values for the total number of combinatorial and coordinational degrees of freedom, thereby representing crystal structures of invariant subdivision complexity.


Introduction
Any standardized crystal structure can be conveniently related to a descriptor uniquely encoding its combinatorial properties, namely its Wyckoff sequence: a string composed of the space group type number (sometimes the Hermann-Mauguin symbol is used instead) and followed by all the Wyckoff letters for each partially or fully occupied Wyckoff position in the crystal structure; the letters are put in reverse alphabetic order and augmented by their superscripted frequency of occurrence, in case a certain non-fixed Wyckoff position is occupied multiple times.
Standardization is necessary because many crystal structures have equivalent descriptions in terms of their unit cell and atomic coordinates, depending either on the possible unit-cell choices or on the symmetry properties of space groups [keywords: symmetry of symmetry, Cheshire groups, Euclidean normalizers; see Müller (2013), ch. 8]. A comprehensive scheme for crystal structure standardization has been developed by Parthé and coworkers (TYPIX) and implemented in the software STRUCTURE TIDY (Parthé & Gelato, 1984, 1985; Gelato & Parthé, 1987; Parthé et al., 1993a,b). The uniqueness of the Wyckoff sequence thus depends on and follows from the uniqueness of the standardization.
The Wyckoff positions of a space group encompass all possible distinct sets of symmetry-equivalent sites within a unit cell. Or, to put it in rigorous mathematical terms: a Wyckoff position of a space group $G$ consists of all points $X$ for which the site-symmetry groups $S(X)$ are conjugate subgroups of $G$ (see Hahn, 2005, ch. 8.3.2, p. 733; Aroyo, 2016, ch. 1.4.4.2, p. 62).
It is important to distinguish the notions of space group and space-group type. A space group encompasses all of the symmetry of an actual crystal structure, including its translation symmetry, specified by its lattice parameters. A space-group type is an abstract notion comprising all those, otherwise distinct, space groups which share the same representation by matrix-column pairs of symmetry operations (Nespolo et al., 2018). This is of special importance in the study of group-subgroup relations, in which a group and one of its proper subgroups can be of the same space-group type, but not of the same space group. Another consequence is that the number of space groups is infinite, whereas the number of space-group types is finite: 230 distinct ones in three dimensions, including 11 enantiomorphous types. In particular, the Hermann-Mauguin symbol represents the space-group type, while concepts related to the Wyckoff positions, such as the Wyckoff sequences studied in this work, are related to the space group, in the sense that they are inseparably connected with the coordinate description of crystal structures. However, the number of distinct Wyckoff positions is considered to be finite, with a total of 1731 positions for all space groups, thus allowing for the combinatorial analysis to follow.
The Wyckoff positions are labelled by up to 27 possible Wyckoff letters ($a$ to $z$ and $\alpha$), which are devoid of any further meaning, but depend on the choice of space group. Since the Wyckoff sequence composed of these letters does not include any specific information about either the unit-cell parameters or the atomic coordinates, it is a more abstract notion as well, and useful as such for purposes of crystal structure classification and systematics (see, e.g., Allmann & Hinek, 2007). Note that this abstraction means that actual crystal structures can share the same Wyckoff sequence, while being distinct with respect to the specific values of the free parameter(s) of their non-fixed Wyckoff position(s). Such crystal structures are called isopointal (Lima-de-Faria et al., 1990).
Yet, Wyckoff sequences can be studied without referring to specific geometric crystal structures. Thus, any Wyckoff sequence really describes an infinite family of crystal structures, sharing a common parametrization in terms of their geometric degrees of freedom, according to the observed partial or full occupancy of their general position(s) and, if present, special position(s) of their corresponding space-group type. Indeed, the Wyckoff sequence has been used in the aforementioned sense, as a coordinate-free representation of crystal structures in modern machine learning approaches to materials discovery (Goodall et al., 2022).
One particular advantage of this abstract, combinatorial point of view is due to the fact that the number of Wyckoff sequences of given length $k$ is finite (Hornfeck, 2022a), making an exhaustive study possible for small values of $k$. In fact, most of the actual crystal structures found so far in nature have Wyckoff sequences of length below $k = 50$. For the space-group type $Pmmm$ (No. 47) with $\nu = 19$ non-fixed sites and $\varphi = 8$ fixed ones, constituting the case with the highest number of possible sequences, this would correspond to about $3.5 \times 10^{18}$ distinct sequences to consider in total, up to the length of $k = 50$ inclusive (see Appendix A for the computation and the exact result).
Another advantage, and one focus of this work, is due to the fact that the Wyckoff sequence translates into the information about a crystal structure's combinatorial ($M$) and coordinational ($A$) collective degrees of freedom, associated with a weighted sum of each Wyckoff position's individual multiplicity ($M_i$) and arity ($A_i$), respectively, the latter being the number of independent coordinate parameters required to be specified in a standardized description of an actual crystal structure.
Both the combinatorial and coordinational degrees of freedom can be used to assess a crystal structure's combinatorial and coordinational complexity by means of an approach pioneered by Krivovichev (2012, 2014) and extended by Hornfeck (2020, 2022b), based on the utilization of the Shannon entropy as a complexity measure. Taking a general point of view, the collective degrees of freedom represent a certain system (a macrostate), while the individual degrees of freedom each represent a certain subsystem (a microstate). This is true on different levels of hierarchy. For instance, on the crystal structure level, the collective configurational degrees of freedom, $M + A$, result as the sum of the individual combinatorial ($M$) and coordinational ($A$) degrees of freedom. In a similar way, on the Wyckoff position level, the collective combinatorial and coordinational degrees of freedom, $M$ or $A$, result as the sum of the individual combinatorial and coordinational degrees of freedom, $M_i$ or $A_i$, respectively.
In combinatorial terms this splitting of a system into subsystems corresponds to an integer partition. However, in the crystallographic application, the number of partitions is not simply given by the number-theoretic partition function, but is restricted by crystallographic symmetry and by the coupling of combinatorial and coordinational degrees of freedom as found within each Wyckoff position. In particular, the possible values of the Wyckoff multiplicities considering all space-group types are restricted to the set $M_i \in \{1, 2, 3, 4, 6, 8, 12, 16, 24, 48\}$ (assuming primitive unit cells, i.e. modulo centring translations, which, in the following, will be the preferred choice of description), while the possible values of the Wyckoff arities are restricted to the set $A_i \in \{0, 1, 2, 3\}$. Moreover, for a given choice of space-group type, the values of either set that can occur are restricted further (although not their frequencies of occurrence, except for fixed positions, which can occur only once), in accordance with the existing Wyckoff positions. A final restriction is imposed by each Wyckoff position introducing a coupling of values from both sets. The combinatorial problem under consideration thus is one of counting the number of restricted, coupled partitions.
Thus, due to these particular restrictions, it is a non-trivial task to describe a macrostate defined by the collective degrees of freedom $(M, A)$ by the composition from or the subdivision into its associated microstates $(M_i, A_i)$. Foremost, it is a natural question to ask, given one macrostate, what is the number of microstates corresponding to it?
The following section will give the answer to this question, with the main results being a generating polynomial approach (Section 2.4, theory; Section 2.7, algorithm) and a dynamic programming approach (Section 2.8, theory and algorithm).

Combinatorics of Wyckoff sequences
To state the problem succinctly: how many distinct Wyckoff sequences (of optionally fixed length $k$) exist, given a pair of combinatorial and coordinational degrees of freedom $(M, A)$ together with a choice of space group? (Note that a space group has to be fixed, since this determines the alphabet of Wyckoff letters which can appear in the Wyckoff sequence.)

A problem of crystals - exposition
The problem statement can be translated rather straightforwardly into algebraic form. Note that two integer values exist for each individual Wyckoff position $i$, its multiplicity $M_i$ and its arity $A_i$ (coordinational degree of freedom). Both have to be considered in a coupled way, which will be done in the notation by the use of column vectors. Then, the total multiplicity and arity, $M$ and $A$, respectively, corresponding to a certain set of Wyckoff sequences of crystal structures, are given as

$$\begin{pmatrix} M \\ A \end{pmatrix} = \sum_{i=1}^{n} \mu_i \begin{pmatrix} M_i \\ A_i \end{pmatrix}. \qquad (1)$$

Here, the sum index $i$ runs over all the $n$ existing Wyckoff positions for a given space group, with the integer multipliers $\mu_i$ denoting the frequency of occurrence of a given site in the sum of individual multiplicities $M_i$ and arities $A_i$ of the combined sites. From a general point of view, as mentioned in the Introduction, one can call the pair $(M, A)$ the system variables (describing the macrostate), while the multiplicities $M_i$ and arities $A_i$ would be the subsystem variables (describing the microstates).
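As a minimal illustration of equation (1), the following Python sketch sums the coupled pairs for a given string of Wyckoff letters. The dictionary `WYCKOFF_113` and the function `totals` are hypothetical names introduced here; the numerical values are those of the six Wyckoff positions of space-group type No. 113 used as the worked example in this paper.

```python
from collections import Counter

# Assumed illustration data: letter -> (multiplicity M_i, arity A_i) for
# space-group type No. 113, as used in the worked example of the text.
WYCKOFF_113 = {
    "a": (2, 0), "b": (2, 0), "c": (2, 1),
    "d": (4, 1), "e": (4, 2), "f": (8, 3),
}

def totals(letters, positions):
    """Return the pair (M, A) of equation (1) for a string of Wyckoff letters."""
    multipliers = Counter(letters)  # frequency of occurrence of each letter
    M = sum(mu * positions[w][0] for w, mu in multipliers.items())
    A = sum(mu * positions[w][1] for w, mu in multipliers.items())
    return M, A

print(totals("ddd", WYCKOFF_113))   # (12, 3)
print(totals("edba", WYCKOFF_113))  # (12, 3)
```

Both example strings map onto the same macrostate, which is exactly the many-to-one relation that the counting problem addresses.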
Information about the Wyckoff positions of the 230 three-dimensional space-group types is compiled in Vol. A of the International Tables for Crystallography (Aroyo, 2016), which is the authoritative source. Alternatively, it can also be retrieved online from the Bilbao Crystallographic Server (https://www.cryst.ehu.es/) using the routine WYCKPOS (Aroyo et al., 2006a,b, 2011). The values of the Wyckoff multiplicities $M_i$ are explicitly stated as the numeral part of the Wyckoff symbol assigned to each Wyckoff position (the non-numeral part is given by the Wyckoff letter). Note, however, that these values are given for the centred unit cells, in which case $M$ should also be specified with respect to the unit-cell content of the centred unit cell, in order to maintain the correct correspondence. Alternatively, the values of the Wyckoff multiplicities can be reduced through division by an integer factor depending on the centring type (2 for C, A and I centring; 3 for R centring; 4 for F centring), if the primitive unit cell, and the number of atoms it contains, is taken as a reference. Indeed, choosing the primitive unit cell as a reference is strongly recommended in the context of crystallographic complexity calculations (cf. Section 3), in order to make the results of these calculations comparable for crystal structures differing in their centring type, since any existing centring translation just repeats parts of a crystal structure inside a unit cell, thereby contributing no additional information to its description, and, accordingly, its complexity.
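The reduction to the primitive cell described above can be sketched in a few lines of Python. The names `CENTRING_FACTOR` and `primitive_multiplicity` are hypothetical, and the entry for primitive cells (factor 1) is added here as an assumption for completeness; the remaining factors are those quoted in the text.

```python
# Centring factors as quoted in the text; "P" (factor 1) is an assumption
# added so that primitive cells can be passed through unchanged.
CENTRING_FACTOR = {"P": 1, "A": 2, "C": 2, "I": 2, "R": 3, "F": 4}

def primitive_multiplicity(tabulated_multiplicity, centring):
    """Reduce a tabulated (centred-cell) multiplicity to the primitive cell."""
    factor = CENTRING_FACTOR[centring]
    if tabulated_multiplicity % factor != 0:
        raise ValueError("multiplicity is not divisible by the centring factor")
    return tabulated_multiplicity // factor

# A tabulated multiplicity of 192 in an F-centred cell reduces to 48,
# the largest value in the set of primitive-cell multiplicities.
print(primitive_multiplicity(192, "F"))  # 48
```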
The values of the Wyckoff arities A i , although not explicitly stated in the aforementioned sources, can be deduced from them, namely by means of visual inspection with respect to the number of positional variables x, y and z occurring in the listing of the general coordinates as provided for each Wyckoff position.
In general, this information can be obtained for any crystallographic space group, including higher-dimensional ones, namely by inspecting the number of symmetry-equivalent positions modulo translations, to obtain the multiplicities, as well as by elucidating the dependency of their general coordinates with respect to symmetry, to obtain the arities. In fact, all this information can be obtained in a purely algorithmic manner [see Brown et al. (1978) for the case of four dimensions].

Figure 1. Geometric interpretation for the problem of finding the number of ways of combining individual Wyckoff positions of given multiplicities and arities, $(M_i, A_i)$, adding up to a given total multiplicity and arity $(M, A)$. In the illustration, the target vector $(M, A) = (12, 3)$, here written in row form, denotes a point in the two-dimensional integer lattice $\mathbb{Z}^2$ (square lattice) in the upper-right corner. On the top, the individual vectors $(M_i, A_i)$ corresponding to each Wyckoff position are shown: a (2,0) red, b (2,0) green, c (2,1) blue, d (4,1) dark red, e (4,2) dark green, f (8,3) dark blue. On the bottom, their combinations adding up to $(M, A) = (12, 3)$ are shown, with vectors composed in reverse lexicographic order. Other possible combinations, in which only the order of the vectors is changed, are not shown. However, all lattice points which can be reached by any possible combination of vectors are highlighted as open circles instead of filled ones. To see the full graph one has to invert the depicted half of it in the point (6, 1.5). In this interpretation the problem becomes a special case of a lattice path enumeration problem with the set of steps governed by the Wyckoff multiplicities and arities for a given choice of space-group symmetry.
It should be noted that the problem at hand can be given different mathematical interpretations and representations. Equation (1) already suggests a geometrical interpretation as two-dimensional vectors, which happen to live on the two-dimensional integer lattice $\mathbb{Z}^2$ (square lattice). The problem then appears as a special case of a lattice path problem (Fig. 1), in which the vectors associated with the Wyckoff positions of a space group define the set of steps.
Alternatively, the two-dimensional plane can be identified with the complex plane of Wessel, Argand and Gauß, suggesting a change of notation, in which $\mathrm{i}$ is the imaginary unit ($\mathrm{i}^2 = -1$):

$$M + \mathrm{i}A = \sum_{i=1}^{n} \mu_i \,(M_i + \mathrm{i}A_i). \qquad (2)$$

While these interpretations and representations are mathematically equivalent, and thus do not seem to make any difference per se, the knowledge of these alternatives can be of importance when it comes to searching for subfields of mathematics discussing already existing solutions to a given problem, or general methods to find them, and also in the case of implementation into computer code.

Conditions on the multipliers
For any given site the multipliers $\mu_i$ are restricted to discrete intervals

$$\{\mu_{i,\min},\ \mu_{i,\min} + 1,\ \ldots,\ \mu_{i,\max} - 1,\ \mu_{i,\max}\} \qquad (3)$$

of potential integer values, which are denoted in shorthand as $[\![\mu_{i,\min}, \mu_{i,\max}]\!]$ in the following. Naturally, by definition, the frequencies of occurrence are bounded from below, for all sites alike, by the minimal multipliers $\mu_{i,\min} = 0$, meaning that a site is absent in this case. The variable upper bounds, the maximal multipliers, are determined according to the case distinction

$$\mu_{i,\max} = \begin{cases} r_i & \text{for non-fixed sites } (A_i > 0), \\ 1 & \text{for fixed sites } (A_i = 0), \end{cases} \qquad (4)$$

differentiating between non-fixed and fixed sites. Collecting the terms for the non-fixed and fixed sites separately, the summation of equation (1) can be split into separate parts,

$$\begin{pmatrix} M \\ A \end{pmatrix} = \sum_{i=1}^{\nu} \mu_i \begin{pmatrix} M_i \\ A_i \end{pmatrix} + \sum_{i=\nu+1}^{\nu+\varphi} \mu_i \begin{pmatrix} M_i \\ 0 \end{pmatrix}, \qquad (5)$$

in which $\nu$ and $\varphi$ now denote the total number of non-fixed and fixed sites, respectively (compare Hornfeck, 2022a). In any case, the repetition $r_i$ is given as

$$r_i = \min\left( \left\lfloor \frac{M}{M_i} \right\rfloor, \left\lfloor \frac{A}{A_i} \right\rfloor \right),$$

with $\lfloor \cdot \rfloor$ denoting the floor function, and the smaller ratio restricting the number of possible occurrences of the corresponding site from above. Stated in a different way, each multiplier for any non-fixed site has to fulfil both of the conditions

$$\mu_i M_i \le M \quad \text{and} \quad \mu_i A_i \le A$$

simultaneously, with $\mu_{i,\max}$ being defined as the largest integer doing so. These conditions originate from the fact that the Wyckoff multiplicities and arities are non-negative integers; thus any surpassing of either one of the limits $M$ or $A$ cannot be balanced by the addition of another Wyckoff position, hence cannot be a part of the solution, and thus signifies an end to the Wyckoff sequence construction process. This guarantees the existence of maximal values $\mu_{i,\max}$ and fixes the size of the search space of which the solution space is a subspace.
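The case distinction for the maximal multipliers can be sketched as follows. This is an illustrative reading of the conditions stated above, not the author's implementation; the function name `max_multipliers` and the dictionary layout are assumptions, and the example values are those of space-group type No. 113 from the worked example.

```python
def max_multipliers(positions, M, A):
    """Maximal multiplier per Wyckoff letter for the target pair (M, A).

    positions maps a letter to (M_i, A_i). Fixed sites (A_i == 0) may occur
    at most once; non-fixed sites are limited by the smaller floor ratio.
    """
    caps = {}
    for letter, (Mi, Ai) in positions.items():
        if Ai == 0:
            caps[letter] = 1                       # fixed site
        else:
            caps[letter] = min(M // Mi, A // Ai)   # non-fixed site
    return caps

# Assumed illustration data (space-group type No. 113).
positions = {"a": (2, 0), "b": (2, 0), "c": (2, 1),
             "d": (4, 1), "e": (4, 2), "f": (8, 3)}
print(max_multipliers(positions, 12, 3))
# {'a': 1, 'b': 1, 'c': 3, 'd': 3, 'e': 1, 'f': 1}
```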
While the determination of the size of the solution space, the number of Wyckoff sequences existing for a given choice of ðM; AÞ, is our main task, the determination of the size of the search space appears as a first step towards a result, in that it gives a numerical upper bound for the size of the solution space, a combinatorial overview about the Wyckoff sequences to expect, with respect to their length, as well as an estimate regarding the computational tractability of their actual construction based on an exhaustive exploration of the search space.

Size of the search space
For non-fixed sites the multiplier intervals are given as $[\![0, r_i]\!] = \{0, 1, \ldots, r_i\}$, with variable $r_i$, while for the fixed sites the multiplier intervals are given as $[\![0, 1]\!] = \{0, 1\}$, invariably. The Cartesian product over all interval sets of multipliers of either type determines the search space:

$$S = [\![0, \mu_{1,\max}]\!] \times [\![0, \mu_{2,\max}]\!] \times \cdots \times [\![0, \mu_{n,\max}]\!].$$

This corresponds to the set of all possible Wyckoff letter sequences, being Wyckoff sequences without the space-group type symbol/number prefix explicitly stated, as constructed from the multiset $[\mu_{1,\max}\lambda_1, \mu_{2,\max}\lambda_2, \ldots, \mu_{n,\max}\lambda_n]$, in which the $\lambda_i$ denote a general Wyckoff letter out of $n$ possible letters for a given space group. Note that the choice of space group, while defining the alphabet of Wyckoff letters and limiting the number of terms to consider in the Cartesian product, does not determine the size of the search space by itself. This size, the cardinality of the search space, in which individual solutions have to be found, if they exist, is determined by the values of the maximal multipliers. Thus, the size of the search space is determined only if both the space group and the total numbers of degrees of freedom $M$ and $A$ are fixed. Then, the search space size gives an absolute upper bound on the potential number of solutions, yet usually, and in anticipation of our following results, the actual number of solutions will be much lower or even zero. This difference in the number of solutions is due to the construction of the search space by means of the Cartesian product, namely because the restrictions imposed by the $\mu_{i,\max}$ values for individual Wyckoff positions do not take into account their cumulative, conditional interactions.
As is often the case in combinatorial problems, the same cardinality $|S|$ can be obtained by an alternative counting method, namely as the result of the summation of all coefficients in the expansion of the univariate polynomial defined by

$$P(x) = \prod_{i=1}^{n} \sum_{j=0}^{\mu_{i,\max}} x^{j} = \sum_{k=0}^{\mu} c_k x^{k}.$$

This is the generating polynomial for a multiset with finite multiplicities [compare its description and, in particular, equation (4) in Hornfeck (2022a)]. Upon its expansion the coefficients $c_k$ for each term $x^k$ of this polynomial count the number of sequences of length $k$, thereby representing a more differentiated view of the search space's contents. Ultimately,

$$|S| = P(1) = \sum_{k=0}^{\mu} c_k,$$

in which $\mu = \mu_{1,\max} + \mu_{2,\max} + \ldots + \mu_{n,\max}$ is the multiset's cardinality.
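The length distribution of the search space can be obtained by multiplying out the univariate factors numerically. The sketch below does so with plain coefficient lists; the function name `length_distribution` is hypothetical, and the input is the list of maximal multipliers (here those of the worked example for space-group type No. 113 with a target of 12 and 3).

```python
def length_distribution(max_multipliers):
    """Coefficients c_k of the product of (1 + x + ... + x^cap) over all sites.

    c_k counts the elements of the search space that have sequence length k.
    """
    coeffs = [1]  # the constant polynomial 1
    for cap in max_multipliers:
        new = [0] * (len(coeffs) + cap)
        for k, c in enumerate(coeffs):
            for j in range(cap + 1):
                new[k + j] += c   # multiply by x^j and accumulate
        coeffs = new
    return coeffs

c = length_distribution([1, 1, 3, 3, 1, 1])
print(len(c), sum(c))  # 11 256
```

The sum of the coefficients reproduces the cardinality of the Cartesian product, while the individual entries resolve it by sequence length.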
The size of the search space gains some importance due to the fact that the combinatorial approach we will describe in the remainder of this work is non-constructive, as is commonly the case for such combinatorial questions, since it gives only the number of potential Wyckoff sequences matching a given parameter pair $(M, A)$, but does not reveal the Wyckoff sequences themselves; these can be discovered by an exhaustive check of all admissible multiplier combinations existing within the combined multiplier intervals.
We envision that the search space size can be reduced to some degree by applying effective intermediate checks for multiplier combinations already violating the upper limit as imposed by the choice of the parameters $(M, A)$, overshooting either one parameter at a time or both simultaneously, possibly in combination with the use of a clever data structure such as pruned trees. However, search spaces of size $|S| < 10^6$ are easily tractable on a standard desktop personal computer, which should encompass most tasks related to the comparison of the potential combinatorial solutions with Wyckoff sequences representing actual crystal structures.
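One possible form of such an exhaustive but pruned search is a depth-first enumeration that abandons a branch as soon as either target value would be overshot. The following Python sketch is only an illustration of that idea under stated assumptions; the function name `enumerate_sequences` and the data layout are hypothetical, and the example again uses the positions of space-group type No. 113.

```python
def enumerate_sequences(positions, M, A):
    """List all Wyckoff letter sequences whose totals equal (M, A).

    positions maps a letter to (M_i, A_i). Letters are emitted in reverse
    alphabetical order; fixed sites (A_i == 0) are used at most once.
    """
    letters = sorted(positions, reverse=True)
    solutions = []

    def extend(idx, m_left, a_left, prefix):
        if idx == len(letters):
            if m_left == 0 and a_left == 0:
                solutions.append("".join(prefix))
            return
        letter = letters[idx]
        Mi, Ai = positions[letter]
        cap = 1 if Ai == 0 else min(m_left // Mi, a_left // Ai)
        for mu in range(cap + 1):
            if mu * Mi > m_left or mu * Ai > a_left:
                break  # pruning: the remaining budget would be overshot
            extend(idx + 1, m_left - mu * Mi, a_left - mu * Ai,
                   prefix + [letter * mu])

    extend(0, M, A, [])
    return solutions

positions = {"a": (2, 0), "b": (2, 0), "c": (2, 1),
             "d": (4, 1), "e": (4, 2), "f": (8, 3)}
print(sorted(enumerate_sequences(positions, 12, 3)))
# ['dccba', 'ddca', 'ddcb', 'ddd', 'edba', 'fba']
```

Because the caps are recomputed from the remaining budget at every level, large parts of the Cartesian product are never visited.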

A generating polynomial approach to find solutions
Our combinatorial problem stated above is solved in two steps: first, by reducing it, conceptually, to an analogous classical problem of combinatorics, the coin change problem, as can be found in many textbooks on the topic (for instance, Marcus, 1998, p. 89), and second, by adapting the classical problem to the crystallographic one.
The classical coin change problem is stated in Appendix B, and can be seen as an illustration of the use of generating polynomials, defined in some abstract variable. Notably, the variable is an indeterminate symbol only, entailing no specific meaning other than to allow algebraic operations performed on it; hence it merely acts as a placeholder and bookkeeping device, yet is the decisive one, in order to systematically find a solution.
This use of generating polynomials has already been described in the first entry of this series (Hornfeck, 2022a) to which the reader, interested in more detailed information, is referred. An introduction to the wider field of generating functions is given by Graham et al. (1994), and a more detailed exposition of the main ideas involved is given by Wilf (2006). Now, adapting the classical coin change problem to our crystallographic one is carried out in three steps: (i) matching the number of distinct types of coins with the number of distinct Wyckoff positions; (ii) identifying the values of distinct types of coins with the pair of values of the Wyckoff position's multiplicity $M_i$ and arity $A_i$; and (iii) treating the pair of values $(M_i, A_i)$ in a coupled way, by introducing two abstract variables $(x, y)$ in the generating polynomials, instead of one.
Thus, in some way, the crystallographic problem is a coin change problem with a twist, based on imaginary coins with denominations on both their front and back sides and a pair of target values to reach upon summation. Similar combinatorial problems arise for the case of real cards, coupling values denoted by digits or letters with symbolic ones such as diamonds, hearts, spades and clubs.
In particular, the third adaptation step is crucial for the correct enumeration. As a consequence of it, generating polynomials of the kind

$$P_i(x, y) = \sum_{j=0}^{\mu_{i,\max}} x^{jM_i} y^{jA_i}$$

are assigned to each Wyckoff position, with their product over all $n$ Wyckoff positions,

$$P(x, y) = \prod_{i=1}^{n} P_i(x, y),$$

yielding the solution, namely as the value of the coefficient of $x^M y^A$ in the expanded form of the polynomial $P(x, y)$.
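The coefficient extraction can be sketched without a computer algebra system by storing each bivariate polynomial as a dictionary from exponent pairs to coefficients. This is a hedged illustration rather than the implementation referred to in Appendix C; the names `site_polynomial`, `multiply` and `expand_product` are hypothetical, and the example data are the pairs of space-group type No. 113.

```python
from collections import defaultdict

def site_polynomial(Mi, Ai, M, A):
    """Polynomial of one site: exponent pair (j*M_i, j*A_i) -> coefficient 1."""
    cap = 1 if Ai == 0 else min(M // Mi, A // Ai)
    return {(j * Mi, j * Ai): 1 for j in range(cap + 1)}

def multiply(p, q):
    """Product of two bivariate polynomials in dictionary form."""
    out = defaultdict(int)
    for (m1, a1), c1 in p.items():
        for (m2, a2), c2 in q.items():
            out[(m1 + m2, a1 + a2)] += c1 * c2
    return dict(out)

def expand_product(pairs, M, A):
    """Expanded product of all site polynomials for the target (M, A)."""
    product = {(0, 0): 1}
    for Mi, Ai in pairs:
        product = multiply(product, site_polynomial(Mi, Ai, M, A))
    return product

pairs = [(2, 0), (2, 0), (2, 1), (4, 1), (4, 2), (8, 3)]
P = expand_product(pairs, 12, 3)
print(P.get((12, 3), 0))  # 6
print(P.get((7, 1), 0))   # 0
```

A missing key corresponds to a vanishing coefficient, i.e. to a pair of total values for which no Wyckoff sequence exists.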
As an aside, one can note that the product over all $n$ sites can be split, according to the contributions of the $\nu$ non-fixed and $\varphi$ fixed sites [compare equation (5)]. This splitting reduces the problem for the fixed sites to a univariate one. It should be noted that this approach can be further generalized, in principle, by taking into account chemical degrees of freedom [as introduced by Hornfeck (2020)] in terms of the atomic numbers of the atoms occupying a given Wyckoff position as well. Then, one would have to introduce a third variable $z$ into the respective polynomials, with all other procedures performed analogously. However, there is a difference, in that the atomic numbers are not restricted in any way, that is to say there exists no natural coupling between them and the Wyckoff multiplicities and arities; they are purely a matter of choice. In contrast, the Wyckoff multiplicities and arities are coupled in their values for each Wyckoff position of a space group. Thus, regarding this relative arbitrariness of choice and the relative unimportance of this general case, we refrain from expanding in this direction for the moment. However, our opinion on this topic might change in the future, if there should be an interesting application for fixing the total chemical degrees of freedom, that is the total electron count within a reduced unit cell, to a set value, together with the other degrees of freedom, and asking for the number of crystal structures fulfilling this condition. In this case, any generalization required can be obtained in a straightforward manner by following the same extension procedure as described in the following.

A generalization including the Wyckoff sequence length
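Another and considerably more useful generalization is made by taking into account the length $k$ of the Wyckoff sequence as a further restriction, which can be seen as an extension and a refinement of a previous result in the combinatorics of Wyckoff sequences (Hornfeck, 2022a). This can be achieved by adjusting equation (1),

$$\begin{pmatrix} M \\ A \\ k \end{pmatrix} = \sum_{i=1}^{n} \mu_i \begin{pmatrix} M_i \\ A_i \\ k_i \end{pmatrix},$$

in addition to

$$r_i = \min\left( \left\lfloor \frac{M}{M_i} \right\rfloor, \left\lfloor \frac{A}{A_i} \right\rfloor, \left\lfloor \frac{k}{k_i} \right\rfloor \right),$$

used for the determination of the $\mu_{i,\max}$ for each individual site $i$. Trivially, each site itself is of length $k_i = 1$ (thus, $\lfloor k/k_i \rfloor = k$), which gives the total length of the Wyckoff sequence as the sum of the multipliers $\mu_i$: $k = \mu_1 + \mu_2 + \ldots + \mu_n$. Again, the final result is given as the value of the coefficient of $x^M y^A z^k$ in the expanded polynomial

$$P(x, y, z) = \prod_{i=1}^{n} P_i(x, y, z),$$

in which

$$P_i(x, y, z) = \sum_{j=0}^{\mu_{i,\max}} x^{jM_i} y^{jA_i} z^{j}$$

defines the individual terms.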
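The trivariate case can be sketched in the same dictionary style, with exponent triples in place of pairs. Two choices in this sketch are assumptions rather than statements of the text: monomials exceeding the target in any exponent are discarded early (they cannot contribute to the target coefficient), and the cap of a fixed site is taken as the smaller of 1 and the length. The function name `count_with_length` is hypothetical.

```python
from collections import defaultdict

def count_with_length(pairs, M, A, k):
    """Coefficient of x^M y^A z^k in the product of trivariate site polynomials."""
    target = (M, A, k)
    product = {(0, 0, 0): 1}
    for Mi, Ai in pairs:
        # each site has length 1, so the length bound is simply k
        cap = min(1, k) if Ai == 0 else min(M // Mi, A // Ai, k)
        site = [(j * Mi, j * Ai, j) for j in range(cap + 1)]
        new = defaultdict(int)
        for (m, a, length), coeff in product.items():
            for dm, da, dl in site:
                mono = (m + dm, a + da, length + dl)
                # assumption of this sketch: drop terms beyond the target
                if all(e <= t for e, t in zip(mono, target)):
                    new[mono] += coeff
        product = dict(new)
    return product.get(target, 0)

pairs = [(2, 0), (2, 0), (2, 1), (4, 1), (4, 2), (8, 3)]
print(count_with_length(pairs, 12, 3, 4))  # 3
```

Summing the result over all admissible lengths recovers the count of the bivariate case.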
As was the case before, the product over all $n$ sites can be split according to the contributions of the $\nu$ non-fixed and $\varphi$ fixed sites. Finally, it should be noted that there exist general methods for obtaining explicit formulas for the coefficients of generating functions, for instance for the powers of bivariate generating functions (Kruchinin et al., 2021), as this is an active field of mathematical research.

A problem of crystals - exemplification
In the following, an illustrative example is discussed in full calculational detail.
The tetragonal space-group type $P\bar{4}2_1m$ (No. 113) encompasses a total of six distinct Wyckoff positions, here written in a more convenient in-line notation, indexed according to their Wyckoff letters, and with their corresponding multiplicities and arities stated:

$$W_a(2, 0),\quad W_b(2, 0),\quad W_c(2, 1),\quad W_d(4, 1),\quad W_e(4, 2),\quad W_f(8, 3).$$

The chosen space group is the one with the smallest number of Wyckoff positions for which all possible arity values $A_i = 0, 1, 2, 3$ are present. The six Wyckoff positions comprise a total of three and four distinct values for the Wyckoff multiplicity and arity, respectively, thereby allowing us to illustrate the combinatorial calculation in some detail. Now, with the arbitrarily chosen values $M = 12$ and $A = 3$ given for this particular example, this results in the following maximal multipliers:

$$\mu_{a,\max} = 1,\quad \mu_{b,\max} = 1,\quad \mu_{c,\max} = 3,\quad \mu_{d,\max} = 3,\quad \mu_{e,\max} = 1,\quad \mu_{f,\max} = 1$$

for each Wyckoff position, and consequently in the following generating polynomials:

$$\begin{aligned} P_a(x) &= 1 + x^{2}, \\ P_b(x) &= 1 + x^{2}, \\ P_c(x, y) &= 1 + x^{2}y + x^{4}y^{2} + x^{6}y^{3}, \\ P_d(x, y) &= 1 + x^{4}y + x^{8}y^{2} + x^{12}y^{3}, \\ P_e(x, y) &= 1 + x^{4}y^{2}, \\ P_f(x, y) &= 1 + x^{8}y^{3}. \end{aligned} \qquad (23)$$

Note how the two fixed sites with Wyckoff letters $a$ and $b$ give rise to exactly the same polynomial factor in one variable $x$ only (namely the one used for the Wyckoff multiplicities), and how the polynomials corresponding to the non-fixed sites are restricted in terms of the monomials of highest degree in either $x$ or $y$ by the given values of either $M$ or $A$ or both. Note also that the number of variables in each polynomial corresponds to the splitting of all $n$ sites into $\nu$ contributions from the non-fixed sites (bivariate case) and $\varphi$ contributions from the fixed ones (univariate case) [compare equation (5)].
In fact, the bivariate polynomial given in equation (24) contains the information about all possible solutions $(M, A)$ which can be constructed from the multiset of Wyckoff letters $[a, b, 3c, 3d, e, f]$ with restricted multiset multiplicities, as determined by the maximal multipliers. Since the multiplier intervals are $[\![0, 3]\!]$ for both the Wyckoff positions $c$ and $d$, as well as $[\![0, 1]\!]$ for all other sites, the search space has a size of $4^2 \times 2^4 = 256$ cases, whose distribution according to the length $k$ of the sequence can be read off from the expansion (11 terms; not shown) of the generating polynomial

$$P(x) = (1 + x)^{4} (1 + x + x^{2} + x^{3})^{2},$$

since $\mu_{i,\max} = 1$ for four out of the six Wyckoff sites and $\mu_{i,\max} = 3$ for the remaining two. The observed six solutions with the property $(M, A) = (12, 3)$ constitute only a tiny fraction of the search space, which is generally true, the size of the search space typically being overwhelmingly larger than the number of solutions.
In particular, the full set of solutions contains the singular occurrence of the empty sequence, $\{\,\}$, corresponding to the trivial monomial $x^0 y^0 = 1$, as well as that of the maximal length sequence, $abcccdddef$, corresponding to the monomial $x^{34} y^{11}$ of highest degree, for which $M = 34$ and $A = 11$. Note how one can instantly check that there exists no solution for, say, the case $M = 7$ and $A = 1$, because no monomial of the form $x^7 y$ appears in $P(x, y)$, or, to put it another way, the coefficient of this monomial in the expansion of $P(x, y)$ is equal to zero. Note also how one specific solution, namely solution $d^3$, corresponds to the occurrence of the monomial $x^{12} y^3$, the $j = 3$ case of the basic monomial $x^{j \cdot 4} y^{j \cdot 1}$ representing the $W_d(4, 1)$ Wyckoff position, in the generating polynomial $P_d(x, y)$ of equation (23).
Taking into account the length $k$ as an additional parameter, one has to adjust the individual polynomial terms according to

$$\begin{aligned} P_a(x, z) &= 1 + x^{2}z, \\ P_b(x, z) &= 1 + x^{2}z, \\ P_c(x, y, z) &= 1 + x^{2}yz + x^{4}y^{2}z^{2} + x^{6}y^{3}z^{3}, \\ P_d(x, y, z) &= 1 + x^{4}yz + x^{8}y^{2}z^{2} + x^{12}y^{3}z^{3}, \\ P_e(x, y, z) &= 1 + x^{4}y^{2}z, \\ P_f(x, y, z) &= 1 + x^{8}y^{3}z. \end{aligned}$$

Expansion of their product yields a polynomial in 142 terms (not shown), with, for instance, the value of the coefficient for $x^M y^A z^k = x^{12} y^3 z^4$ being equal to 3, corresponding to the Wyckoff letter sequences which form a subset of the six aforementioned solutions.
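For orientation, the six solutions for the target pair of 12 and 3 can be written out explicitly together with their lengths. The listing below was worked out by hand from the six multiplicity and arity pairs of this example and is given as a check, not as a reproduction of a table from the original; letters are in reverse alphabetical order with exponents denoting repetitions.

```latex
\begin{array}{lll}
\text{sequence} & \text{sum of } (M_i, A_i) & k \\ \hline
fba      & (8,3) + (2,0) + (2,0) = (12,3)          & 3 \\
d^{3}    & 3\,(4,1) = (12,3)                       & 3 \\
edba     & (4,2) + (4,1) + (2,0) + (2,0) = (12,3)  & 4 \\
d^{2}cb  & 2\,(4,1) + (2,1) + (2,0) = (12,3)       & 4 \\
d^{2}ca  & 2\,(4,1) + (2,1) + (2,0) = (12,3)       & 4 \\
dc^{2}ba & (4,1) + 2\,(2,1) + (2,0) + (2,0) = (12,3) & 5
\end{array}
```

The three entries of length 4 are the ones counted by the coefficient of the monomial with the fourth power of the length variable, and the two sequences that differ only in the fixed letter are distinct because the two fixed sites are distinct Wyckoff positions with identical value pairs.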

Summary of the generating function approach
To summarize the aforementioned results in an algorithmic form, using the generating polynomial approach one has to perform the following steps to obtain a solution (with the first two items in the enumeration defining the input to the algorithm and the last one its output):
(i) Fix a space group and thereby the set of Wyckoff positions and the potential alphabet of Wyckoff letters from which a Wyckoff sequence can be formed.
(ii) Fix the values for the total Wyckoff multiplicity M and the total Wyckoff arity A and (optionally) the Wyckoff sequence length k, all of which are expected to be integers.
(iii) Retrieve all the individual values for the Wyckoff multiplicities and arities $W_i(M_i, A_i)$.
(iv) From the individual Wyckoff multiplicities and arities compute the individual maximal multipliers $\mu_{i,\max}$.
(v) From the individual maximal multipliers construct the individual generating polynomials, $P_i(x, y)$ or $P_i(x, y, z)$.
(vi) From the individual generating polynomials build their product, $P(x, y)$ or $P(x, y, z)$, and expand it.
(vii) From the expansion read off the coefficient for the monomial of the form $x^M y^A$ or $x^M y^A z^k$; this is the solution.

A dynamic programming approach to find solutions
Finally, after illustrating the aforementioned combinatorial method on an explicit example, we want to highlight the possibility of an alternative algorithmic way of calculation which turns out to be very efficient, by making optimal use of the recursive nature and overlapping substructure of the problem, in such a way that subsolutions are only ever computed once and retrieved as needed for the calculation of the solution. This algorithmic way is known under the name dynamic programming. A simple example for the classical univariate and infinite coin change problem is given in Appendix B. In the following, we show the adapted pseudocode for the bivariate and finite coin change problem in its crystallographic application, thus taking into account the case distinction for the multipliers of non-fixed and fixed Wyckoff sites (Fig. 2). The adaptation to the trivariate and finite case including the length k as a parameter follows a straightforward procedure (Fig. 3), as would be the case for any future multivariate generalization, for instance also taking into account chemical degrees of freedom.
In either case, the approach starts with the trivial base case (there is always one possibility for the null tuple), from which it proceeds bottom-up making use of a tabulation implementation of subsolution values cached into either a matrix (bivariate case) or a tensor (trivariate case), which is updated iteratively until the target case is reached and the solution is returned. Equation (29) shows the solution matrix $T$ for the case defined by the target tuple $(M, A) = (12, 3)$ and the multiset of tuples $W = [(2, 0), (2, 0), (2, 1), (4, 1), (4, 2), (8, 3)]$, in which the values for $M_i$ and $A_i$ increase along the rows and the columns, respectively. Note that for a more economic display, the matrix is shown in its transposed form, with columns and rows interchanged, such that its upper-left corner represents the matrix element $T_{0,0}$ and the lower-right corner represents the matrix element $T_{12,3}$, from which the solution, $T_{12,3} = 6$, can be read off.
A direct comparison shows that all non-zero entries represent the non-vanishing coefficients as occurring in the polynomial expansion given in equation (24), yet with terms only up to the solution monomial $x^{12} y^{3}$ inclusive. Note that the algebraic structure of the Wyckoff positions is reflected in the matrix as well, since the reachable positions are determined by the smallest increments and their parity as observed for the Wyckoff multiplicities and arities. For instance, since all the Wyckoff multiplicities are even numbers, all odd columns of the matrix in equation (29) are given by the null vector. A Python implementation of the algorithm for the bivariate case is given in Appendix C, which also contains a Mathematica implementation of the generating polynomial approach.

Figure 3
Dynamic programming algorithm (given in pseudocode) for the determination of the number of Wyckoff sequences of a given space group, subdivision complexity and length as determined by the total number of degrees of freedom for the Wyckoff multiplicity, Wyckoff arity and length (trivariate case). Note that $k_i = 1$ for all Wyckoff sites, so this does not have to be specified for each Wyckoff position independently. However, possible simplifications due to this fact have not been included in the pseudocode in order to highlight its similarity with the bivariate case shown in Fig. 2.
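The tabulation described above may be sketched as follows. This is an illustrative Python version under stated assumptions, not a transcription of the pseudocode in Fig. 2: the function name is hypothetical, fixed positions (arity 0) are treated as usable at most once, and non-fixed positions as usable arbitrarily often within the target.

```python
def solution_matrix(positions, M, A):
    """Return T with T[m][a] = number of Wyckoff sequences with totals (m, a)."""
    T = [[0] * (A + 1) for _ in range(M + 1)]
    T[0][0] = 1  # base case: one possibility for the null tuple
    for m_i, a_i in positions:
        if a_i == 0:
            # Fixed position: iterate m downwards so it is counted at most once.
            for m in range(M, m_i - 1, -1):
                for a in range(A + 1):
                    T[m][a] += T[m - m_i][a]
        else:
            # Non-fixed position: iterate upwards so repeated occupation is allowed.
            for m in range(m_i, M + 1):
                for a in range(a_i, A + 1):
                    T[m][a] += T[m - m_i][a - a_i]
    return T

W = [(2, 0), (2, 0), (2, 1), (4, 1), (4, 2), (8, 3)]
T = solution_matrix(W, 12, 3)
print(T[12][3])  # 6

# Transposed display: one row per arity value, one column per multiplicity value.
for a in range(4):
    print([T[m][a] for m in range(13)])
```

The direction of the inner loop over the multiplicity is the only difference between the two branches; it encodes the case distinction between fixed and non-fixed Wyckoff sites.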

Figure 2
Dynamic programming algorithm (given in pseudocode) for the determination of the number of Wyckoff sequences of a given space group and subdivision complexity as determined by the total number of degrees of freedom for the Wyckoff multiplicity and arity (bivariate case).
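For the trivariate case, the same tabulation can be extended by a third index for the sequence length. The following Python sketch is an assumption-laden illustration rather than a rendering of Fig. 3: the function name is hypothetical, and every occupied site is taken to contribute $k_i = 1$ to the length.

```python
def count_sequences_with_length(positions, M, A, K):
    """Number of Wyckoff sequences with totals (M, A) and exactly K occupied sites."""
    # T[m][a][k] = number of ways to reach totals (m, a) using exactly k sites.
    T = [[[0] * (K + 1) for _ in range(A + 1)] for _ in range(M + 1)]
    T[0][0][0] = 1
    for m_i, a_i in positions:
        if a_i == 0:
            # Fixed position: downward sweep in m, so it is used at most once.
            for m in range(M, m_i - 1, -1):
                for a in range(A + 1):
                    for k in range(1, K + 1):
                        T[m][a][k] += T[m - m_i][a][k - 1]
        else:
            # Non-fixed position: upward sweep, so it may be used repeatedly.
            for m in range(m_i, M + 1):
                for a in range(a_i, A + 1):
                    for k in range(1, K + 1):
                        T[m][a][k] += T[m - m_i][a - a_i][k - 1]
    return T[M][A][K]

W = [(2, 0), (2, 0), (2, 1), (4, 1), (4, 2), (8, 3)]
print(count_sequences_with_length(W, 12, 3, 4))  # 3
```

Summing the result over all admissible lengths recovers the bivariate count, which gives a simple consistency check between the two variants.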

Complexity of Wyckoff sequences
The aforementioned problem of crystals arose in the context of calculating Shannon entropy based complexity measures for crystal structures, taking into account a crystal structure's fundamental chemical, combinatorial and coordinational degrees of freedom (Hornfeck, 2020). Apart from its chemical degrees of freedom (its decoration, or colouring, of Wyckoff sites with atoms), a crystal structure is geometrically defined by its combined combinatorial and coordinational degrees of freedom, the collective multiplicities $M$ and arities $A$ of its occupied Wyckoff positions.

Shannon entropy based complexity measures
In general, a given multiset $[X_1, X_2, \ldots, X_n]$ of $n$ individual degrees of freedom $X_i$ defines a discrete probability distribution

$$p_i = \frac{X_i}{X}, \qquad i = 1, 2, \ldots, n,$$

where $X = X_1 + X_2 + \ldots + X_n$ denotes the collective number of degrees of freedom as obtained from the partition of the individual numbers of degrees of freedom. From this a Shannon entropy can be obtained:

$$H = \sum_{i=1}^{n} L(p_i), \qquad L(p) = -p \log_2 p.$$

Note that $L(0) = 0$, by definition. As mentioned before, the general interpretation is that of a system with $X$ degrees of freedom on a higher level of structural hierarchy being subdivided into $n$ subsystems of $X_i$ degrees of freedom, each on a lower level of structural hierarchy. For more details of the general theory, the reader is referred to previous work done by the author (Hornfeck, 2020, 2022b).

In particular, a crystal structure's $M$ collective combinatorial degrees of freedom are associated with the multiset $[M_1, M_2, \ldots, M_n]$ of individual Wyckoff multiplicities $M_i$, thereby yielding a fundamental combinatorial Shannon entropy:

$$H_\mathrm{comb} = \sum_{i=1}^{n} L\!\left(\frac{M_i}{M}\right)$$

(2020)]. Now, in the same way, a crystal structure's $A$ collective coordinational degrees of freedom are associated with the multiset $[A_1, A_2, \ldots, A_n]$ of individual Wyckoff arities $A_i$, thereby yielding a fundamental coordinational Shannon entropy

$$H_\mathrm{coor} = \sum_{i=1}^{n} L\!\left(\frac{A_i}{A}\right).$$

Now, proceeding in a completely analogous manner, a crystal structure's $F = M + A$ collective configurational degrees of freedom are associated with the combined multiset of individual Wyckoff multiplicities $M_i$ and individual Wyckoff arities $A_i$, corresponding to the individual configurational degrees of freedom $F_i$, thereby yielding a composite configurational Shannon entropy

$$H_\mathrm{conf} = \sum_{i} L\!\left(\frac{F_i}{F}\right) = \sum_{i=1}^{n} L\!\left(\frac{M_i}{F}\right) + \sum_{i=1}^{n} L\!\left(\frac{A_i}{F}\right).$$

While the association with the combined multiset is natural, it is worth mentioning that this does not mean that the corresponding entropies are (simply) additive; on the contrary, in general

$$H_\mathrm{conf} \neq H_\mathrm{comb} + H_\mathrm{coor}.$$

Most importantly, however, by means of the strong additivity property of the Shannon entropy, the configurational entropy is equivalent to the general expression including appropriate weighting factors $w_M = M/(M + A)$ and $w_A = A/(M + A)$ and an additional subdivision (mixing) complexity based on them,

$$H_\mathrm{conf} = H_\mathrm{subdiv} + w_M H_\mathrm{comb} + w_A H_\mathrm{coor}, \qquad H_\mathrm{subdiv} = L(w_M) + L(w_A). \tag{37}$$

This equivalence now allows an assessment of the relative complexity related to the splitting of a given collective number of degrees of freedom, $M + A$, for the composite system, into individual numbers of degrees of freedom, $M$ and $A$, for the fundamental (sub)systems. The subdivision complexity $H_\mathrm{subdiv}$ is a Shannon entropy of the weighting factors occurring in equation (37) and for all the combinatorially enumerated cases it is an invariant characterizing the subdivision step $(M + A) \to (M, A)$ on the higher crystal structure level of hierarchy. It can be seen as connecting the collective treatment of degrees of freedom, within the combined configurational complexity $H_\mathrm{conf}$, with the individual treatment of degrees of freedom, within the separated combinatorial and coordinational complexities, $H_\mathrm{comb}$ and $H_\mathrm{coor}$, respectively. This allows for a complexity partition analysis (Hornfeck, 2022b).
This central importance of the subdivision complexity $H_\mathrm{subdiv}$ and its relationship to the other complexity measures is graphically depicted in Fig. 4.
In the same manner, other subdivision complexities exist on the lower Wyckoff position level of hierarchy, characterizing the subdivision steps $(X_1 + X_2 + \ldots + X_n) \to (X_1, X_2, \ldots, X_n)$, with $X$ denoting either $M$ or $A$.
Furthermore, it should be noted that the strong additivity property also holds for the case of maximal Shannon entropies,

$$H_\mathrm{conf,max} = H_\mathrm{subdiv} + w_M H_\mathrm{comb,max} + w_A H_\mathrm{coor,max},$$

where $H_\mathrm{subdiv}$ takes on the same value as in equation (37). The maximal entropies are reached in the case of maximal subdivision and thus perfect equidistribution of degrees of freedom (corresponding to maximally expanded partitions $1 + 1 + \ldots + 1$ of variable length equal to the number of degrees of freedom). This makes their calculation particularly simple, resulting eventually in $H_\mathrm{comb,max} = \log_2 M$, $H_\mathrm{coor,max} = \log_2 A$ and $H_\mathrm{conf,max} = \log_2 (M + A) = \log_2 F$. Each maximal entropy can also be used in turn to define its corresponding non-maximal entropy, for instance by means of taking the difference between the maximal entropies for the collective (here, $F$) and individual (here, $F_i$) degrees of freedom, the latter ones attributed with their appropriate weighting factors $F_i/F$.
All the aforementioned interrelations are depicted in Fig. 5.
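The strong additivity relation can be checked numerically. The Python sketch below assumes the usual definitions, that is, probabilities $X_i/X$, logarithms to base 2 and $L(p) = -p \log_2 p$ with $L(0) = 0$; the function names are hypothetical. As input it uses the Wyckoff sequence $fe^4ca$ of space group No. 113, with the multiplicities and arities of the tuples $(2, 0)$, $(2, 1)$, $(4, 2)$ and $(8, 3)$ for the letters $a$, $c$, $e$ and $f$.

```python
from math import isclose, log2

def L(p):
    """Entropy contribution of a single probability, with L(0) = 0 by definition."""
    return 0.0 if p == 0 else -p * log2(p)

def shannon(parts):
    """Shannon entropy (in bits) of the distribution parts[i] / sum(parts)."""
    total = sum(parts)
    return sum(L(x / total) for x in parts) if total else 0.0

def complexities(mult, arity):
    """Combinatorial, coordinational, configurational and subdivision entropies."""
    M, A = sum(mult), sum(arity)
    w_M, w_A = M / (M + A), A / (M + A)
    return {
        "H_comb": shannon(mult),
        "H_coor": shannon(arity),
        "H_conf": shannon(list(mult) + list(arity)),  # combined multiset
        "H_subdiv": L(w_M) + L(w_A),
        "w_M": w_M,
        "w_A": w_A,
    }

# Sequence f e^4 c a: one entry per occupied site (f, e, e, e, e, c, a).
mult = [8, 4, 4, 4, 4, 2, 2]
arity = [3, 2, 2, 2, 2, 1, 0]
c = complexities(mult, arity)

# Strong additivity: H_conf = H_subdiv + w_M * H_comb + w_A * H_coor.
rhs = c["H_subdiv"] + c["w_M"] * c["H_comb"] + c["w_A"] * c["H_coor"]
print(isclose(c["H_conf"], rhs))  # True

# The same relation for the maximal entropies log2(M), log2(A), log2(M + A).
M, A = sum(mult), sum(arity)
rhs_max = c["H_subdiv"] + c["w_M"] * log2(M) + c["w_A"] * log2(A)
print(isclose(log2(M + A), rhs_max))  # True
```

Because $H_\mathrm{subdiv}$ depends only on $(M, A)$, it takes the same value for every sequence enumerated for a given target tuple.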

Combining the combinatorics and complexity of Wyckoff sequences
Some elementary statistical results shall be stated about the magnitude of the collective degrees of freedom to expect, in extreme and on average, and with respect to actual crystal structure data as retrieved from the 20 040 unique Wyckoff sequences compiled in the Pearson's Crystal Data Crystal Structure Database for Inorganic Compounds (Villars & Cenzual, 2020). The minimal observed combinatorial and coordinational degrees of freedom are, as determined for the individual, independent distributions, $M_\mathrm{min} = 1$, $A_\mathrm{min} = 0$; the maximal are $M_\mathrm{max} = 5926$, $A_\mathrm{max} = 1476$; the mean values are $M_\mathrm{mean} = 94.3$, $A_\mathrm{mean} = 53.1$; the median values are $M_\mathrm{median} = 48$, $A_\mathrm{median} = 20$; and the mode values are $M_\mathrm{mode} = 24$, $A_\mathrm{mode} = 6$, respectively.
Our combinatorial result now tells us how many of these partitions, being solutions to a combinatorial problem restricted by crystallographic symmetry, can be realized for a given space group, given the fixed number of degrees of freedom ðM; AÞ and, potentially, the length k of the Wyckoff sequence. In the following we will illustrate this by performing explicit calculations for three examples.
It should be noted that the examples given here were chosen such that the discussion of the results could be made explicit, with the full set of potential Wyckoff sequence solutions being listed, and with some actual representatives being found among the known crystal structures. In general, the number of potential solutions might exhibit a rapid growth (combinatorial explosion), while the number of actual representatives lags behind greatly.

Figure 5
Schematic representation of the interrelation between various maximal and non-maximal entropies (combinatorial, coordinational, configurational) and the subdivision entropy on different levels of structural hierarchy. On the Wyckoff position level of hierarchy the difference of (maximal) entropies defines the conventional non-maximal entropies (from left to right), while on the crystal structure level of hierarchy the (partially weighted) sum of entropies defines the maximal and non-maximal configurational entropy (from bottom to top). Note that the distinction between differences and sums of entropies depends on a deliberate choice of how to distribute terms on either side of the equals sign. In the scheme as presented here, the case for differences of entropies is based on the choice of collecting all maximal entropies together, while distributing collective and individual contributions on opposite sides of the equation could highlight the fact that the non-maximal entropies fulfil the same role as a subdivision complexity on the Wyckoff position level as the subdivision complexity $H_\mathrm{subdiv}$ does on the crystal structure level, thus highlighting the applicability of the strong additivity property on both levels of hierarchy (in this case the common weighting factors $w_M$ or $w_A$ can be omitted on the Wyckoff position level of hierarchy).

Acta Cryst. (2023). A79, 280-294. Hornfeck and Červený, On the combinatorics of crystal structures. II.
4.1. Example 1: space-group type $P\bar{4}2_1m$ (No. 113) revisited
As our first example, we revisit the case of space-group type $P\bar{4}2_1m$ (No. 113). Table 1 compiles some values of specific Shannon entropies for the Wyckoff sequences given in this example, taking into account the combinatorial, coordinational and configurational degrees of freedom, highlighting the constant term $H_\mathrm{subdiv}$, with the other Shannon entropies related according to their strong additive sum:

$$H_\mathrm{conf} = H_\mathrm{subdiv} + w_M H_\mathrm{comb} + w_A H_\mathrm{coor}.$$

A search in the Pearson's Crystal Data Crystal Structure Database for Inorganic Compounds (Villars & Cenzual, 2020) for the space-group type $P\bar{4}2_1m$ (No. 113) reveals a total of 759 Wyckoff sequences for which crystal structure prototypes have been assigned. Reducing for multiple entries arising from multiple crystal structure determinations sharing the same prototype yields 79 entries with a unique Wyckoff sequence/prototype combination. (Notably, 538 entries share the same Wyckoff sequence $113\,fe^{3}ca$, with seven distinct prototypes associated with it, of which the tP24-Ca$_2$MgSi$_2$O$_7$ prototype alone occurs 307 times, thereby explaining the heavy degree of reduction observed.) Reducing again for distinct prototypes sharing the same Wyckoff sequence yields 63 unique Wyckoff sequences.
On the basis of this work, one obtains a larger number of 44 possible Wyckoff letter sequence solutions [based on the Wyckoff multiplicities and arities, compare equation (21)], which can be constructed from an exhaustive exploration of a search space of much larger size of 11 648 multiplier choices (representatives with prototypes marked by an asterisk):

$f^{2}e^{3}$; $fe^{4}d$; $e^{5}d^{2}$; $f^{2}e^{2}c^{2}$; $fe^{3}dc^{2}$; $e^{4}d^{2}c^{2}$; $f^{2}ec^{4}$; $fe^{2}dc^{4}$; $e^{3}d^{2}c^{4}$; $f^{2}c^{6}$; $fedc^{6}$; $e^{2}d^{2}c^{6}$; $fdc^{8}$; $ed^{2}c^{8}$; $d^{2}c^{10}$; $^{\ast}fe^{4}cb$; $e^{5}dcb$; $fe^{3}c^{3}b$; $e^{4}dc^{3}b$; $fe^{2}c^{5}b$; $e^{3}dc^{5}b$; $fec^{7}b$; $e^{2}dc^{7}b$; $fc^{9}b$; $edc^{9}b$; $dc^{11}b$; $^{\ast}fe^{4}ca$; $e^{5}dca$; $fe^{3}c^{3}a$; $^{\ast}e^{4}dc^{3}a$; $fe^{2}c^{5}a$; $e^{3}dc^{5}a$; $fec^{7}a$; $e^{2}dc^{7}a$; $fc^{9}a$; $edc^{9}a$; $dc^{11}a$; $e^{6}ba$; $e^{5}c^{2}ba$; $e^{4}c^{4}ba$; $e^{3}c^{6}ba$; $e^{2}c^{8}ba$; $ec^{10}ba$; $c^{12}ba$.

While the combinatorial approach is non-constructive, it might often suffice to only know the number of possible solutions.
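Where an explicit list is wanted, the search space of multiplier choices can be explored exhaustively. The following Python sketch is an illustration only: the target tuple $(M, A) = (28, 12)$ is inferred here from the listed sequences, the per-letter tuples are those of the multiset $W$ of the worked example assigned to the letters $a$ to $f$, the function names are hypothetical, and exponents are printed as plain digits (for instance `f2e3` for $f^2e^3$).

```python
from itertools import product
from math import prod

# (letter, multiplicity, arity) for space group No. 113.
POSITIONS_113 = [("a", 2, 0), ("b", 2, 0), ("c", 2, 1),
                 ("d", 4, 1), ("e", 4, 2), ("f", 8, 3)]

def multiplier_ranges(positions, M, A):
    """Admissible multipliers 0..n_max for every Wyckoff position."""
    ranges = []
    for _letter, m_i, a_i in positions:
        # Assumption: fixed positions (arity 0) are occupied at most once.
        n_max = 1 if a_i == 0 else min(M // m_i, A // a_i)
        ranges.append(range(n_max + 1))
    return ranges

def format_sequence(positions, multipliers):
    """Letters in reverse alphabetical order, with multipliers > 1 appended."""
    parts = []
    for (letter, _m, _a), n in sorted(zip(positions, multipliers), reverse=True):
        if n:
            parts.append(letter + (str(n) if n > 1 else ""))
    return "".join(parts)

def enumerate_sequences(positions, M, A):
    """List all Wyckoff letter sequences with total multiplicity M and arity A."""
    found = []
    for ns in product(*multiplier_ranges(positions, M, A)):
        total_m = sum(n * m_i for n, (_l, m_i, _a) in zip(ns, positions))
        total_a = sum(n * a_i for n, (_l, _m, a_i) in zip(ns, positions))
        if (total_m, total_a) == (M, A):
            found.append(format_sequence(positions, ns))
    return found

ranges = multiplier_ranges(POSITIONS_113, 28, 12)
search_space = prod(len(r) for r in ranges)
sequences = enumerate_sequences(POSITIONS_113, 28, 12)
print(search_space)    # 11648
print(len(sequences))  # 44
```

The brute-force search scales with the product of all multiplier ranges, which is why counting by generating polynomials or dynamic programming is preferable whenever only the number of solutions is needed.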

4.2. Example 2: space-group type $Fm\bar{3}m$ (No. 225)
As another example, we consider the cubic space-group type $Fm\bar{3}m$ (No. 225) encompassing a total of 12 distinct Wyckoff positions of the following multiplicities and arities (listed for the Wyckoff letters $a$ to $l$):

$$W = [(1, 0), (1, 0), (2, 0), (6, 0), (6, 1), (8, 1), (12, 1), (12, 1), (12, 1), (24, 2), (24, 2), (48, 3)].$$

Note that the multiplicities stated here are those for a reduced (primitive) unit cell, while the multiplicities for an F-centred unit cell would be four times larger. Again, all of the four possible values for the Wyckoff arity are present in this example, in addition to seven distinct values for the Wyckoff multiplicity. This example was chosen because it partly corresponds to actual crystal structures. Now, for the given choice of $(M, A) = (28, 3)$ degrees of freedom one finds 17 matching Wyckoff letter sequences:

$if^{2}$; $hf^{2}$; $^{\ast}gf^{2}$; $f^{2}ed$; $^{\ast}ifec$; $^{\ast}hfec$; $gfec$; $^{\ast}fe^{2}dc$; $^{\ast}ifeba$; $hfeba$; $gfeba$; $fe^{2}dba$; $f^{3}cba$; $ie^{2}cba$; $he^{2}cba$; $^{\ast}ge^{2}cba$; $^{\ast}e^{3}dcba$.

Here, the seven sequences preceded by an asterisk are realized as crystal structures. A compilation of their complexity values, which are interrelated according to

$$H_\mathrm{conf} = H_\mathrm{subdiv} + w_M H_\mathrm{comb} + w_A H_\mathrm{coor}, \qquad w_M = \tfrac{28}{31}, \quad w_A = \tfrac{3}{31},$$

is given in Table 2. If one performs the same calculation with the additional restriction of the Wyckoff sequence length to the value $k = 5$, one obtains four solutions, which the reader can easily check against the list given above.
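The counts quoted for this example can be reproduced with a top-down (memoized) variant of the recursion, which also resolves the solutions by sequence length. The Python sketch below is an illustration under stated assumptions: the function name is hypothetical, and the tuple of (multiplicity, arity) pairs is taken from the standard Wyckoff position tables of space group No. 225, with the multiplicities of the F-centred cell divided by four.

```python
from functools import lru_cache

# (multiplicity, arity) for the Wyckoff letters a to l of space group No. 225,
# referred to the primitive cell (assumption: standard tabulated values / 4).
POSITIONS_225 = ((1, 0), (1, 0), (2, 0), (6, 0), (6, 1), (8, 1),
                 (12, 1), (12, 1), (12, 1), (24, 2), (24, 2), (48, 3))

def count_by_length(positions, M, A):
    """Return {k: number of Wyckoff sequences of length k with totals (M, A)}."""
    positions = tuple(positions)

    @lru_cache(maxsize=None)
    def ways(i, m, a, k):
        # Number of ways to realize (m, a, k) using positions i, i + 1, ...
        if i == len(positions):
            return 1 if (m, a, k) == (0, 0, 0) else 0
        m_i, a_i = positions[i]
        total = ways(i + 1, m, a, k)  # position i not used (any further)
        if m >= m_i and a >= a_i and k >= 1:
            if a_i == 0:
                # Fixed position: use once, then move on.
                total += ways(i + 1, m - m_i, a, k - 1)
            else:
                # Non-fixed position: use once more and stay at position i.
                total += ways(i, m - m_i, a - a_i, k - 1)
        return total

    counts = {}
    for k in range(M + 1):
        c = ways(0, M, A, k)
        if c:
            counts[k] = c
    return counts

by_length = count_by_length(POSITIONS_225, 28, 3)
print(sum(by_length.values()))  # 17
print(by_length[5])             # 4
```

Each subproblem is evaluated once and cached, so the top-down form does the same work as the bottom-up tabulation while visiting only states that are actually reachable from the target.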

Example 3: space-group type Cmcm (No. 63)
As a final example, we consider the orthorhombic space-group type $Cmcm$ (No. 63) encompassing a total of eight distinct Wyckoff positions of the following multiplicities and arities (listed for the Wyckoff letters $a$ to $h$):

$$W = [(2, 0), (2, 0), (2, 1), (4, 0), (4, 1), (4, 2), (4, 2), (8, 3)].$$

Note that the multiplicities stated here are those for a reduced (primitive) unit cell, while the multiplicities for a C-centred unit cell would be two times larger. Again, all of the four possible values for the Wyckoff arity are present in this example, in addition to three distinct values for the Wyckoff multiplicity. Now, for the given choice of $(M, A) = (22, 10)$ degrees of freedom, one finds 67 matching Wyckoff letter sequences:

$hg^{3}c$; $hg^{2}fc$; $hgf^{2}c$; $^{\ast}hf^{3}c$; $g^{4}ec$; $g^{3}fec$; $^{\ast}g^{2}f^{2}ec$; $gf^{3}ec$; $f^{4}ec$; $hg^{2}c^{3}$; $hgfc^{3}$; $hf^{2}c^{3}$; $g^{3}ec^{3}$; $^{\ast}g^{2}fec^{3}$; $gf^{2}ec^{3}$; $f^{3}ec^{3}$; $hgc^{5}$; $^{\ast}hfc^{5}$; $g^{2}ec^{5}$; $gfec^{5}$; $f^{2}ec^{5}$; $^{\ast}hc^{7}$; $gec^{7}$; $fec^{7}$; $ec^{9}$; $g^{5}b$; $g^{4}fb$; $g^{3}f^{2}b$; $g^{2}f^{3}b$; $gf^{4}b$; $f^{5}b$; $g^{4}c^{2}b$; $g^{3}fc^{2}b$; $g^{2}f^{2}c^{2}b$; $gf^{3}c^{2}b$; $^{\ast}f^{4}c^{2}b$; $g^{3}c^{4}b$; $g^{2}fc^{4}b$; $gf^{2}c^{4}b$; $f^{3}c^{4}b$; $g^{2}c^{6}b$; $gfc^{6}b$; $f^{2}c^{6}b$; $gc^{8}b$; $fc^{8}b$; $c^{10}b$; $g^{5}a$; $g^{4}fa$; $g^{3}f^{2}a$; $g^{2}f^{3}a$; $gf^{4}a$; $f^{5}a$; $g^{4}c^{2}a$; $g^{3}fc^{2}a$; $g^{2}f^{2}c^{2}a$; $^{\ast}gf^{3}c^{2}a$; $^{\ast}f^{4}c^{2}a$; $g^{3}c^{4}a$; $^{\ast}g^{2}fc^{4}a$; $gf^{2}c^{4}a$; $^{\ast}f^{3}c^{4}a$; $g^{2}c^{6}a$; $gfc^{6}a$; $^{\ast}f^{2}c^{6}a$; $gc^{8}a$; $fc^{8}a$; $c^{10}a$.

Here, the 11 sequences preceded by an asterisk are realized as crystal structures. A compilation of their complexity values, which are interrelated according to

$$H_\mathrm{conf} = H_\mathrm{subdiv} + w_M H_\mathrm{comb} + w_A H_\mathrm{coor}, \qquad w_M = \tfrac{22}{32}, \quad w_A = \tfrac{10}{32},$$

is given in the corresponding table.

Table 2
Configurational complexities for the seven possible Wyckoff sequences (of length $k$) with 28 combinatorial and three coordinational degrees of freedom of space group type number 225 which have realizations as crystal structures in nature.

Conclusion
As a contribution to the combinatorics of Wyckoff sequences, we have presented two methods to calculate their number for a fixed space group, given a pair of combinatorial and coordinational total degrees of freedom, and, optionally, their length. The first method is based on a generating polynomial approach (see Sections 2.4 and 2.7 for the key results), while the second makes use of a dynamic programming algorithm (Section 2.8). While the generating polynomial approach appears to be conceptually easier to understand, the dynamic programming algorithm is considerably better in its computational performance. The methods have been exemplified on cases of ideal and actual crystal structures with invariant subdivision complexity and variable configurational complexity in the sense of Hornfeck (2020), thus relating the combinatorics of Wyckoff sequences to the complexities of crystal structures.

APPENDIX A Number of Wyckoff sequences of length k
The number of Wyckoff sequences of length $k$ is given as in which and ', respectively, denote the number of non-fixed and fixed Wyckoff positions of a given space group [equation (7) gives the number of distinct Wyckoff sequences up to $k = 50$ inclusive.
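As an illustration of this count, the following Python sketch evaluates one possible closed form and checks it against brute-force enumeration. The closed form is an assumption of this sketch (a standard stars-and-bars argument, not necessarily the expression used by the authors): if $j$ of the fixed positions are occupied, each at most once, the remaining $k - j$ sites are a multiset drawn from the non-fixed positions. The function names and the arguments `n_nonfixed` and `n_fixed` are hypothetical.

```python
from itertools import combinations_with_replacement
from math import comb

def multichoose(n, r):
    """Number of multisets of size r drawn from n distinct items."""
    if r == 0:
        return 1
    return comb(n + r - 1, r) if n > 0 else 0

def count_length_k(n_nonfixed, n_fixed, k):
    """Sequences of length k: choose j fixed sites once each, fill the rest freely."""
    return sum(comb(n_fixed, j) * multichoose(n_nonfixed, k - j)
               for j in range(min(n_fixed, k) + 1))

def count_length_k_brute(n_nonfixed, n_fixed, k):
    """Brute-force check: enumerate multisets and reject repeated fixed sites."""
    items = range(n_fixed + n_nonfixed)  # indices below n_fixed are fixed positions
    return sum(
        1
        for combo in combinations_with_replacement(items, k)
        if all(combo.count(i) <= 1 for i in range(n_fixed))
    )

# Space group No. 113: four non-fixed (c, d, e, f) and two fixed (a, b) positions.
for k in range(5):
    print(k, count_length_k(4, 2, k), count_length_k_brute(4, 2, k))
```

Both columns printed by the loop should agree for every $k$; the brute-force variant is only practical for small lengths.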

APPENDIX B The classical coin change problem
Assume you are in the possession of the following multiset of coins: three coins of value $v = 1$, four coins of value $v = 2$, two coins of value $v = 5$. How many ways exist to pay a total price of value $V = 12$? Let $S_v$ denote the individual sum of coins of value $v$, hence $S_1 + S_2 + S_5 = V$. The values which each individual sum may take are determined and limited in number by the number of available coins of each nominal value. Thus, one gets

$$S_1 \in \{0, 1, 2, 3\}, \qquad S_2 \in \{0, 2, 4, 6, 8\}, \qquad S_5 \in \{0, 5, 10\} \tag{52}$$

as all possible values. Now, the trick is that one identifies these values with the exponents of a couple of generating polynomials, one for each type of coin, namely

$$P_1(x) = 1 + x + x^2 + x^3, \qquad P_2(x) = 1 + x^2 + x^4 + x^6 + x^8, \qquad P_5(x) = 1 + x^5 + x^{10}, \tag{53}$$

in which $x^0 = 1$ by definition. From these individual polynomials one forms their product

$$P(x) = P_1(x) P_2(x) P_5(x) \tag{54}$$

and analyses its expansion

$$1 + x + 2x^2 + 2x^3 + 2x^4 + 3x^5 + 3x^6 + 4x^7 + 4x^8 + 4x^9 + 4x^{10} + 4x^{11} + 4x^{12} + 4x^{13} + 4x^{14} + 3x^{15} + 3x^{16} + 2x^{17} + 2x^{18} + 2x^{19} + x^{20} + x^{21}. \tag{55}$$

The coefficient of $x^{12}$ in $P(x)$ gives the desired solution; here it is equal to four, describing the four cases (i) $2 \times 1 + 2 \times 5$, (ii) $1 \times 2 + 2 \times 5$, (iii) $3 \times 1 + 2 \times 2 + 1 \times 5$ and (iv) $1 \times 1 + 3 \times 2 + 1 \times 5$, all amounting to 12.
A generalization is possible, in case the number of coins available for all given values $v$ is assumed to be infinite. Then, the generating polynomials are replaced by the respective generating functions of a geometric series, that is $(1 - x^v)^{-1}$ for each coin value $v$, and the solution to the problem is given by their product $F(x)$ in the same way as before, namely as the value of the coefficient $[x^V] F(x)$, in which $V$ denotes the total price. In the infinite case the result is $[x^{12}] F(x) = 13$, necessarily containing at least as many solutions as were found in the finite case. In this general form the problem has been addressed by Pólya (1956). A closed form solution for the univariate infinite case has been described by Graham et al. (1994, pp. 327-330 and pp. 344-346). Note that $\prod_{k \geq 1} (1 - x^k)^{-1}$ is the generating function of the number-theoretic integer partition function (Alfonsín, 2005, p. 71), and our problem asks for the number of restricted partitions into specified, possibly repeated parts (a finite or infinite number of coins with a finite set of distinct denominations). In this context the number of non-negative integer solutions is known as Sylvester's denumerant (Sylvester, 1857), which has a rich history in number theory, partly due to the difficulty of obtaining exact results (Alfonsín, 2005, ch. 4).
Finally, it should be emphasized that an elegant method of solving this problem in practice is given by the concept of dynamic programming. Since the problem again and again reduces to simpler subproblems of the same kind, the solutions of these subproblems can be stored whenever they occur for the first time. Moreover, using a bottom-up approach (top-down would be equally feasible), the solutions for subproblems can be systematically generated from the already known solutions. For this purpose the function here given in a Python implementation is initialized with a result vector of length $n + 1$ of the form $(1, 0, 0, \ldots, 0)$. This result vector is subsequently updated in a separate loop for every coin value. The final value at position $n$ of the result vector then is the solution to the (infinite) coin problem described before (for $n = 12$ the result is equal to 13).
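A minimal Python sketch of the procedure just described is given below; it is a reconstruction from the verbal description, not the original listing, and the function names are hypothetical. The first function treats the infinite case with a single result vector, and the second adds a bounded variant for the finite multiset of coins as a cross-check of the generating polynomial result.

```python
def coin_change_infinite(values, n):
    """Number of ways to pay n with unlimited coins of the given values."""
    ways = [1] + [0] * n  # result vector (1, 0, 0, ..., 0) of length n + 1
    for v in values:      # one separate update loop per coin value
        for s in range(v, n + 1):
            ways[s] += ways[s - v]
    return ways[n]

def coin_change_finite(coins, n):
    """Number of ways to pay n when coins maps each value to its available count."""
    ways = [1] + [0] * n
    for v, available in coins.items():
        updated = [0] * (n + 1)
        for s in range(n + 1):
            # Use j coins of value v, with j limited by supply and by the sum s.
            for j in range(available + 1):
                if j * v > s:
                    break
                updated[s] += ways[s - j * v]
        ways = updated
    return ways[n]

print(coin_change_infinite([1, 2, 5], 12))          # 13
print(coin_change_finite({1: 3, 2: 4, 5: 2}, 12))   # 4
```

The infinite variant updates the vector in place in ascending order, which is what permits a coin value to be reused; the finite variant builds a fresh vector per coin value so that the supply limit can be enforced.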

APPENDIX C Code examples
Figs. 6 and 7 give working examples for the calculation of the number of Wyckoff sequences with the same subdivision complexity by means of the generating polynomial and dynamic programming approaches. The input values are taken from the example described in Section 2.6, with the output being equal to six.